Abstract—Given a propositional formula F(x, y), a Skolem
function for x is a function ψ(y), such that substituting ψ(y) for
x in F gives a formula semantically equivalent to ∃x F. Automatically
generating Skolem functions is of significant interest
in several applications including certified QBF solving, finding 
strategies of players in games, synthesising circuits and bit- 
vector programs from specifications, disjunctive decomposition of 
sequential circuits, etc. In many such applications, F is given as 
a conjunction of factors, each of which depends on a small subset 
of variables. Existing algorithms for Skolem function generation 
ignore any such factored form and treat F as a monolithic 
function. This presents scalability hurdles in medium to large 
problem instances. In this paper, we argue that exploiting the 
factored form of F can give significant performance improvements
in practice when computing Skolem functions. We present
a new CEGAR style algorithm for generating Skolem functions 
from factored propositional formulas. In contrast to earlier work, 
our algorithm neither requires a proof of QBF satisfiability 
nor uses composition of monolithic conjunctions of factors. We 
show experimentally that our algorithm generates smaller Skolem 
functions and outperforms state-of-the-art approaches on several 
large benchmarks. 

I. Introduction 

Skolem functions, introduced by Thoralf Skolem in the 
1920s, occupy a central role in mathematical logic. Formally, 
let F(x, y) be a first-order logic formula, and let dom(x) and 
dom(y) denote the domains of x and y respectively. A Skolem 
function for x in F is a function ψ : dom(y) → dom(x) such
that substituting ψ(y) for x in F yields a formula semantically
equivalent to ∃x F(x, y), i.e. F(ψ(y), y) ≡ ∃x F(x, y). In
this paper, we focus on the case where the formula F is 
propositional and given as a conjunction of factors. Classically, 
Skolem functions have been used in proving theorems in logic. 
More recently, with the advent of fast SAT/SMT solvers, it 
has been shown that several practically relevant problems can 
be encoded as quantified formulas, and can be solved by 
constructing realizers of quantified variables. We identify these 
realizers as specific instances of Skolem functions, and focus 
on algorithms for constructing them in this paper. 

We begin by listing some applications that illustrate the utility
of constructing instances of Skolem functions in practice.

1) Quantifier elimination. Given a quantified formula
Qx F(x, y), where Q ∈ {∃, ∀}, the quantifier elimination
problem requires us to find a quantifier-free formula
that is semantically equivalent to Qx F(x, y). Quantifier
elimination has important applications in diverse areas 
(see, e.g. 0, 031. 0 for a sampling). It follows from 
the definition of Skolem function that eliminating the 
quantifier from ∃x F(x, y) can be achieved by substituting
x with a Skolem function for x. Since ∀x F(x, y) can be
written as ¬∃x ¬F(x, y), the same idea applies in this
case too. In fact, the process can be repeated in principle 
to eliminate quantifiers from a formula with arbitrary 
quantifier prefix. 

2) Controller Synthesis and Games. Control-program synthesis
in the Ramadge-Wonham [12] framework reduces
to games between two players—environment and the 
controller—such that the optimal strategy of the controller 
corresponds to an optimal control program. The optimal 
(or winning) strategy of the controller corresponds to 
choosing values of variables controlled by it such that 
regardless of the way the environment fixes its variables, 
the resulting play satisfies the controller’s objective. If 
the rules of the game are encoded as a propositional 
formula and if the strategy space for both players is 
finite, the optimal strategy of the controller corresponds 
to finding Skolem functions of variables controlled by 
it. In fact, for a number of two-player games—such as 
reachability games and safety games 0, tic-tac-toe Q 
and chess-like games 0, 0 —the problem of deciding 
a winner can be reduced to checking satisfiability of a 
quantified Boolean formula (QBF), and the problem of 
finding winning or best-effort strategy reduces to Skolem 
function generation. 

3) Graph Decomposition. Skolem functions can be used to 
compute disjunctive decompositions of implicitly specified
state transition graphs of sequential circuits [16]. The
disjunctive decomposition problem asks the following 
question: Given a sequential circuit, derive “component” 
sequential circuits, each of which has the same state space 
as the original circuit, but only a subset of transitions 
going out of every state. The components should be such 
that the complete set of state transitions of the original 
circuit is the union of the sets of state transitions of 
the components. Disjunctive decompositions have been 
shown to be useful in efficient reachability analysis 03 . 

There are several other practical applications where Skolem 
functions find use; see, e.g. El, for a discussion. Hence, there 
is a growing need for practically efficient and scalable approaches
for generating instances of Skolem functions. Large
and complex representations of the formula F in ∃x F often
present scalability hurdles in generating Skolem functions 
in practice. Interestingly, for several problem instances, the 
specification of F is available in a factored form, i.e., as a 
conjunction of simpler sub-formulas, each of which depends
on a subset of variables appearing in F. Unfortunately, unlike
in the case of disjunction, existential quantification does not
distribute over conjunction of sub-formulas. Existing algorithms
therefore ignore any factored form of F and treat the
conjunction of factors as a single monolithic function. We 
show in this paper that exploiting the factored form can help 
significantly when generating Skolem functions. 
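
The failure of existential quantification to distribute over conjunction can be seen on a two-factor toy instance. The following is a minimal Python sketch for illustration only; the factors f1 = (x ∨ y) and f2 = (¬x ∨ y) and the helper names `exists` and `same_on_y` are assumptions of this sketch and do not come from the benchmarks discussed in this paper.

```python
def exists(var, f):
    """Existentially quantify `var` out of a Boolean function f : dict -> bool."""
    return lambda env: f({**env, var: False}) or f({**env, var: True})

# Two illustrative factors over x and y (chosen for this sketch).
f1 = lambda env: env["x"] or env["y"]            # x OR y
f2 = lambda env: (not env["x"]) or env["y"]      # NOT x OR y

def same_on_y(g, h):
    """Compare two functions of y alone on both valuations of y."""
    return all(g({"y": y}) == h({"y": y}) for y in (False, True))

# exists x. (f1 AND f2)  versus  (exists x. f1) AND (exists x. f2)
conj_then_exists = exists("x", lambda e: f1(e) and f2(e))
exists_then_conj = lambda e: exists("x", f1)(e) and exists("x", f2)(e)

# exists x. (f1 OR f2)  versus  (exists x. f1) OR (exists x. f2)
disj_then_exists = exists("x", lambda e: f1(e) or f2(e))
exists_then_disj = lambda e: exists("x", f1)(e) or exists("x", f2)(e)

distributes_over_and = same_on_y(conj_then_exists, exists_then_conj)
distributes_over_or = same_on_y(disj_then_exists, exists_then_disj)
print(distributes_over_and, distributes_over_or)
```

Here ∃x (f1 ∧ f2) simplifies to y, while (∃x f1) ∧ (∃x f2) is identically true, so the two differ at y = 0. This is why each factor cannot simply be quantified on its own.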

Our main technical contribution is a SAT-based Counter- 
Example Guided Abstraction-Refinement (CEGAR) algorithm 
for generating Skolem functions from factored formulas. Unlike
competing approaches, our algorithm exploits the factored
representation of a formula and leverages advances made in 
SAT-solving technology. The factored representation is used 
to arrive at an initial abstraction of Skolem functions, while a 
SAT-solver is used as an oracle to identify counter-examples 
that are used to refine the Skolem functions until no counterexamples
exist. We present a detailed experimental evaluation
of our algorithm vis-a-vis state-of-the-art algorithms q, m 
over a large class of benchmarks. We show that on several 
large problem instances, we outperform competing algorithms. 

Related Work. We are not aware of other techniques for 
Skolem function generation that exploit the factored form 
of a formula. Earlier work on Skolem function generation
broadly falls into one of four categories. The first category
includes techniques that extract Skolem functions from a proof
of validity of ∃X F(X, Y) O, GQ, 0 , @. In problem
instances where ∃X F(X, Y) is valid (and this forms an
important sub-class of problems), these techniques can usually 
find succinct Skolem functions if there exists a short proof 
of validity. However, in several other important classes of 
problems, the formula ∃X F(X, Y) does not evaluate to
true for all values of Y, and techniques in the first category 
cannot be applied. The second category includes techniques 
that use templates for candidate Skolem functions [14]. These
techniques are effective only when the set of candidate Skolem 
functions is known and small. While this is a reasonable 
assumption in some domains m. it is not in most other 
domains. BDD-based techniques [13] are yet another way to
compute Skolem functions. Unfortunately, these techniques 
are known not to scale well, unless custom-crafted variable 
orders are used. The last category includes techniques that 
use cofactors to obtain Skolem functions 0, m. These 
techniques do not exploit the factored representation of a 
formula and, as we show experimentally, do not scale well 
to large problem instances. 

II. Preliminaries 

We use lower case letters (possibly with subscripts) to 
denote propositional variables, and upper case letters to 
denote sequences of such variables. We use 0 and 1 to 
denote the propositional constants false and true, respectively.
Let F(X, Y) be a propositional formula, where X
and Y denote the sequences of variables (x_1, ..., x_n) and
(y_1, ..., y_m), respectively. We are interested in problem
instances where F(X, Y) is given as a conjunction of factors
f^1(X_1, Y_1), ..., f^r(X_r, Y_r), where each X_j (resp., Y_j) is
a possibly empty sub-sequence of X (resp., Y). For notational
convenience, we use F and ⋀_{j=1}^{r} f^j interchangeably
throughout this paper. The set of variables in F is
called the support of F, and is denoted Supp(F). Given
a propositional formula F(X) and a propositional function
Ψ(X), we use F[x_i/Ψ(X)], or simply F[x_i/Ψ], to denote
the formula obtained by substituting every occurrence of the
variable x_i in F with Ψ(X). Since the notions of formulas
and functions coincide in propositional logic, the above is also
conventionally called function composition. If X is a sequence
of variables and x_i is a variable, we use X \ x_i to denote
the sub-sequence of X obtained by removing x_i (if present)
from X. Abusing notation, we use X to also denote the set
of elements in X, when there is no confusion. A valuation or
assignment π of X is a mapping π : X → {0, 1}.

Definition 1. Given a propositional formula F(X, Y) and a
variable x_i ∈ X, a Skolem function for x_i in F(X, Y) is a
function ψ(X \ x_i, Y) such that ∃x_i F ≡ F[x_i/ψ].

A Skolem function for x_i in F need not be unique. The
following proposition, which effectively follows from 0, ®,
characterizes the space of all Skolem functions for x_i in F.

Proposition 1. A function ψ(X \ x_i, Y) is a Skolem function
for x_i in F(X, Y) iff F[x_i/1] ∧ ¬F[x_i/0] ⇒ ψ and ψ ⇒
F[x_i/1] ∨ ¬F[x_i/0].
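
Proposition 1 can be checked exhaustively on a small instance. The sketch below is an illustration under stated assumptions: the formula F = (x ∨ y1) ∧ (¬x ∨ y2) and all helper names are chosen for this example only. It enumerates all 16 candidate functions ψ(y1, y2) as truth tables and compares Definition 1 against the two implications of the proposition.

```python
from itertools import product

def F(x, y1, y2):
    # Illustrative formula (an assumption of this sketch):
    # (x OR y1) AND (NOT x OR y2)
    return (x or y1) and ((not x) or y2)

Y_VALS = list(product([False, True], repeat=2))

def is_skolem(psi):
    """Definition 1: (exists x. F) is equivalent to F[x/psi].
    psi is a truth table mapping (y1, y2) to a Boolean."""
    return all((F(False, *y) or F(True, *y)) == F(psi[y], *y) for y in Y_VALS)

def within_bounds(psi):
    """Proposition 1: F[x/1] AND NOT F[x/0] => psi, and
    psi => F[x/1] OR NOT F[x/0]."""
    for y in Y_VALS:
        lower = F(True, *y) and not F(False, *y)
        upper = F(True, *y) or not F(False, *y)
        if (lower and not psi[y]) or (psi[y] and not upper):
            return False
    return True

# All 2^4 = 16 candidate functions psi(y1, y2), as truth tables.
candidates = [dict(zip(Y_VALS, bits))
              for bits in product([False, True], repeat=len(Y_VALS))]

agree = all(is_skolem(p) == within_bounds(p) for p in candidates)
num_skolem = sum(is_skolem(p) for p in candidates)
print(agree, num_skolem)
```

For this F, the value of ψ is forced only at (y1, y2) = (0, 1) and (1, 0), so four of the sixteen candidates are Skolem functions, and the two characterizations agree on every candidate.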

The function F[x_i/1] (resp., F[x_i/0]) is called the positive
(resp., negative) cofactor of F with respect to x_i, and plays a
central role in the study of Skolem functions for propositional
formulas. In particular, it follows from Proposition 1 that
F[x_i/1] is a Skolem function for x_i in F. The above definition
for a single variable can be naturally extended to a vector of
variables. Given F(X, Y), a Skolem function vector for X =
(x_1, ..., x_n) in F is a vector of functions Ψ = (ψ_1, ..., ψ_n)
such that ∃x_1 ... x_n F ≡ (···(F[x_1/ψ_1])···[x_n/ψ_n]). A
straightforward way to obtain a Skolem function vector Ψ is
to first obtain a Skolem function ψ_1 for x_1 in F, then compute
F' ≡ ∃x_1 F and obtain a Skolem function ψ_2 for x_2 in F', and
so on until ψ_n has been obtained. More formally, ψ_i can be
computed as a Skolem function for x_i in ∃x_1 ... x_{i-1} F, starting
from ψ_1 and proceeding to ψ_n. Note that ∃x_1 ... x_{i-1} F
can itself be computed as (···(F[x_1/ψ_1])···[x_{i-1}/ψ_{i-1}]).

Definition 2. The “Can’t-be-1” function for x_i in F, denoted
Cb1[x_i](F), is defined to be (¬∃x_1 ... x_{i-1} F)[x_i/1].
Similarly, the “Can’t-be-0” function for x_i in F, denoted
Cb0[x_i](F), is defined to be (¬∃x_1 ... x_{i-1} F)[x_i/0]. When
X and F are clear from the context, we use Cb1[i] and Cb0[i]
for Cb1[x_i](F) and Cb0[x_i](F), respectively.

Intuitively, in order to make F evaluate to 1, we cannot set x_i
to 1 (resp., 0) whenever the valuation of {x_{i+1}, ..., x_n} ∪ Y
satisfies Cb1[i] (resp., Cb0[i]). The following proposition follows
from Definition 2 and from our observation about computing
a Skolem function vector one component at a time.


Proposition 2. Ψ = (¬Cb1[1], ..., ¬Cb1[n]) is a Skolem function
vector for X in F.

Note that the support of ψ_i in Ψ, as given by Proposition 2,
is {x_{i+1}, ..., x_n} ∪ Y. If we want a Skolem function vector
such that every component function has only Y (or a
subset thereof) as support, this can be obtained by repeatedly
substituting the Skolem function for every variable x_i in all
other Skolem functions where x_i appears. We denote such a
Skolem function vector as Ψ(Y).
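
As a small sanity check of Proposition 2 and of the substitution order, the following Python sketch builds ψ_1 = ¬Cb1[1] and ψ_2 = ¬Cb1[2] by brute force for a formula over two X variables and two Y variables. The formula and the function names are assumptions made for this illustration; the quantifier is evaluated by enumeration rather than by any symbolic method.

```python
from itertools import product

B = (False, True)

def F(x1, x2, y1, y2):
    # Illustrative formula (an assumption of this sketch).
    return ((x1 != x2) and (x1 or y1) and ((not x2) or y2)
            and ((not x1) or y1 or y2))

def psi1(x2, y1, y2):
    # NOT Cb1[1] = F[x1/1]; its support is {x2} union Y.
    return F(True, x2, y1, y2)

def psi2(y1, y2):
    # NOT Cb1[2] = (exists x1. F)[x2/1]; its support is Y only.
    return any(F(x1, True, y1, y2) for x1 in B)

def composed(y1, y2):
    # (F[x1/psi1])[x2/psi2]: x2 is fixed by psi2 first, and that value is
    # fed into psi1, which mirrors the reverse substitution yielding Psi(Y).
    x2 = psi2(y1, y2)
    x1 = psi1(x2, y1, y2)
    return F(x1, x2, y1, y2)

def exists_X(y1, y2):
    return any(F(x1, x2, y1, y2) for x1 in B for x2 in B)

is_skolem_vector = all(composed(*y) == exists_X(*y)
                       for y in product(B, repeat=2))
print(is_skolem_vector)
```

The valuation y1 = y2 = 0 makes ∃X F false, so the check also covers a case where no choice of X can satisfy F.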

III. A monolithic composition based algorithm

Our algorithm is motivated in part by cofactor-based techniques
for computing Skolem functions, as proposed by Jiang
et al. 0 and Trivedi [16]. Given F(X, Y) = ⋀_{j=1}^{r} f^j(X_j, Y_j),
the techniques of 0, ED essentially compute a Skolem
function vector Ψ(Y) for X in F as shown in algorithm
MonoSkolem (see Algorithm 1). In this algorithm, the
variables in X are assumed to be ordered by their indices.
While variable ordering is known to affect the difficulty of
computing Skolem functions 0, we assume w.l.o.g. that
the variables are indexed to represent a desirable order. We
describe the variable order used in our study later in Section V.

MonoSkolem works in two phases. In the first phase, it 
implements a straightforward strategy for obtaining a Skolem 
function vector, as suggested by Proposition 2. Specifically,
steps 3 and 4 of MonoSkolem build a monolithic conjunction
F_i of all factors that have x_i in their support, before
computing ψ_i. This restricts the scope of the quantifier for
x_i to the conjunction of these factors. In Step 6, we use
¬Cb1[i] as a specific choice for the Skolem function ψ_i. After
computing ψ_i from F_i, step 7 discards the factors with x_i
in their support, and introduces a single factor representing
∃x_i F_i (computed as F_i[x_i/ψ_i]) in their place. Note that each
ψ_i obtained in this manner has {x_{i+1}, ..., x_n} ∪ Y (or a subset
thereof) as support. Since we want each Skolem function to
have support Y, a second phase of “reverse” substitutions is
needed. In this phase (see Algorithm 2), the Skolem function
ψ_n(Y) obtained above is substituted for x_n in ψ_1, ..., ψ_{n-1}.
This effectively renders all Skolem functions independent of
x_n. The process is then repeated with ψ_{n-1} substituted for
x_{n-1} in ψ_1, ..., ψ_{n-2} and so on, until all Skolem functions
have been made independent of x_1, ..., x_n, and have only Y
(or subsets thereof) as support.

MonoSkolem can be further refined by combining steps 
5 and 6, and directly defining ψ_i in terms of F_i. However,
we introduce the intermediate step using Cb0[i] and Cb1[i] to
motivate their central role in our approach. Note that instead
of ¬Cb1[i], we could combine Cb1[i] and Cb0[i] in other
ways (denoted by Combine(Cb0[i], Cb1[i]) within comments
in Algorithm 1) to get ψ_i in Step 6. In fact, Jiang et al. 0
compute a Skolem function for x_i in F as an interpolant
of ¬Cb1[i] ∧ Cb0[i] and Cb1[i] ∧ ¬Cb0[i], while Trivedi [16]
observes that the function (¬Cb1[i] ∧ (Cb0[i] ∨ g)) ∨ (Cb1[i] ∧
Cb0[i] ∧ h) serves as a Skolem function for x_i in F where
h and g are arbitrary propositional functions with support in

Algorithm 1: MonoSkolem

Input: Prop. formula F(X, Y) = ⋀_{j=1}^{r} f^j(X_j, Y_j),
       where X = (x_1, ..., x_n)
Output: Skolem function vector Ψ(Y)

  // Phase 1 of algorithm
1 Factors := {f^j : 1 ≤ j ≤ r};
2 for i in 1 to n do
3     FactorsWithXi := {f : f ∈ Factors, x_i ∈ Supp(f)};
4     F_i := ⋀_{f ∈ FactorsWithXi} f;
5     Cb0[i] := ¬F_i[x_i/0]; Cb1[i] := ¬F_i[x_i/1];
6     ψ_i := ¬Cb1[i];
      // Generally, ψ_i := Combine(Cb0[i], Cb1[i]);
7     Factors := (Factors \ FactorsWithXi) ∪ {F_i[x_i/ψ_i]};
  // Phase 2 of algorithm
8 return ReverseSubstitute(ψ_1, ..., ψ_n);


(X \ {x_i}) ∪ Y. Since computing interpolants using a SAT solver
is often time-intensive and does not always lead to succinct
Skolem functions 0, we simply use ¬Cb1[i] as a Skolem
function in Step 6. Proposition 2 guarantees the correctness of
this choice.


Algorithm 2: ReverseSubstitute

Input: Functions ψ_1(x_2, ..., x_n, Y), ψ_2(x_3, ..., x_n, Y), ..., ψ_n(Y)
Output: Function vector Ψ(Y)

1 for i = n downto 2 do
2     for k = i − 1 downto 1 do ψ_k := ψ_k[x_i/ψ_i];
3 return Ψ(Y) = (ψ_1(Y), ..., ψ_n(Y));


Observe that MonoSkolem works with a monolithic
conjunction (F_i) of factors that have x_i in their support.
Specifically, it composes each such monolithic conjunction
F_i with a cofactor of F_i in Step 7 to eliminate quantifiers
sequentially. This can lead to large memory footprints and 
more time-outs when used with medium to large benchmarks, 
as confirmed by our experiments. This motivates us to ask if 
we can develop a cofactor-based algorithm that does not suffer 
from the above drawbacks of MonoSkolem. 
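
The two phases of MonoSkolem can be sketched in a few lines of Python. This is a minimal illustration, not the implementation evaluated in this paper: Boolean functions are modelled as Python closures with an explicit support set instead of circuits or decision diagrams, and the names `Fn`, `conj`, `subst`, `mono_skolem` and `check` are hypothetical. The three factors are those of Example 1 in Subsection IV-C.

```python
from itertools import product

class Fn:
    """Boolean function as a Python closure plus an explicit support set."""
    def __init__(self, support, ev):
        self.support, self.ev = frozenset(support), ev
    def __call__(self, env):
        return self.ev(env)

def conj(fs):
    fs = list(fs)
    support = frozenset().union(*(f.support for f in fs))
    return Fn(support, lambda env: all(f(env) for f in fs))

def subst(f, var, g):
    """f[var/g]: evaluate g, bind var to the result, then evaluate f."""
    return Fn((f.support - {var}) | g.support,
              lambda env: f({**env, var: g(env)}))

TRUE = Fn((), lambda env: True)

def mono_skolem(factors, xs):
    factors, psi = list(factors), {}
    for x in xs:                                   # Phase 1 (steps 2-7)
        with_x = [f for f in factors if x in f.support]
        F_i = conj(with_x)                         # monolithic conjunction
        psi[x] = subst(F_i, x, TRUE)               # psi_i = NOT Cb1[i] = F_i[x_i/1]
        rest = [f for f in factors if x not in f.support]
        factors = rest + [subst(F_i, x, psi[x])]   # stands for exists x_i. F_i
    for i in range(len(xs) - 1, 0, -1):            # Phase 2: ReverseSubstitute
        for k in range(i - 1, -1, -1):
            psi[xs[k]] = subst(psi[xs[k]], xs[i], psi[xs[i]])
    return psi

# Factors of Example 1.
f1 = Fn({"x1", "x2", "y1"}, lambda e: not e["x1"] or not e["x2"] or not e["y1"])
f2 = Fn({"x2", "y2", "y3"}, lambda e: e["x2"] or not e["y3"] or not e["y2"])
f3 = Fn({"x1", "x2", "y3"}, lambda e: e["x1"] or not e["x2"] or e["y3"])
F = conj([f1, f2, f3])
psi = mono_skolem([f1, f2, f3], ["x1", "x2"])

def check(psi):
    """Compare exists X. F with F[X/Psi(Y)] on every valuation of Y."""
    for ys in product([False, True], repeat=3):
        env = dict(zip(["y1", "y2", "y3"], ys))
        exists_x = any(F({**env, "x1": a, "x2": b})
                       for a in (False, True) for b in (False, True))
        plugged = F({**env, "x1": psi["x1"](env), "x2": psi["x2"](env)})
        if exists_x != plugged:
            return False
    return True

is_correct = check(psi)
print(is_correct)
```

Each call to `subst` on `F_i` nests the whole conjunction inside another closure. In a symbolic representation this nesting corresponds to the growth of the composed functions that motivates the question raised above.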

IV. CEGAR for generating Skolem functions 

We now present a new CEGAR [6] algorithm for generating
Skolem function vectors, that exploits the factored form of 
F(X,Y). Like MonoSkolem, our new algorithm, named 
CegarSkolem, works in two phases, and assumes that the 
variables in X are ordered by their indices. The first phase 
of the algorithm consists of the core abstraction-refinement 
part, and computes a Skolem function vector (ψ_1, ..., ψ_n),
where ψ_i has {x_{i+1}, ..., x_n} ∪ Y, or a subset thereof, as support.
Unlike in MonoSkolem, this phase avoids composing
monolithic conjunctions of factors, yielding simpler Skolem 
functions. The second phase of the algorithm performs reverse 
substitutions, similar to that in MonoSkolem. 









Before describing the details of CegarSkolem, we introduce
some additional notation and terminology. Given propositional
functions (or formulas) f and g, we say that f refines
g and g abstracts f iff f logically implies g. Given F(X, Y)
and a vector of functions Ψ^A = (ψ_1^A, ..., ψ_n^A), we say that
Ψ^A is an abstract Skolem function vector for X in F iff there
exists a Skolem function vector Ψ = (ψ_1, ..., ψ_n) for X in F
such that ψ_i^A abstracts ψ_i, for every i ∈ {1, ..., n}. Instead
of using Cb0[i] and Cb1[i] to compute Skolem functions, as
was done in MonoSkolem, we now use their refinements,
denoted r0[i] and r1[i] respectively, to compute abstract
Skolem functions. For convenience, we represent r0[i] and
r1[i] as sets of implicitly disjoined functions. Thus, if r1[i],
viewed as a set, is {g_1, g_2}, then it is g_1 ∨ g_2 when viewed as
a function. We abuse notation and use r1[i] (resp., r0[i]) to
denote a set of functions or their disjunction, as needed.

A. Overview of our CEGAR algorithm 

Algorithm CegarSkolem has two phases. The first phase 
consists of a CEGAR loop, while the second does reverse 
substitutions. The CEGAR loop has the following steps. 

- Initial abstraction and refinement. This step involves 
constructing refinements of Cb0[i] and Cb1[i] for every
x_i in X. Using Proposition 2, we can then construct an
initial abstract Skolem function vector Ψ^A. This step is
implemented in Algorithm 3 (InitAbsRef), which processes
individual factors of F(X, Y) = ⋀_{j=1}^{r} f^j(X_j, Y_j)
separately, without considering their conjunction. As a 
result, this step is time and memory efficient if the 
individual factors are simple with small representations. 

- Termination Condition. Once InitAbsRef has computed
Ψ^A, we check whether Ψ^A is already a Skolem
function vector. This is achieved by constructing an
appropriate propositional formula ε, called the “error formula”
for Ψ^A (details in Subsection IV-C), and checking
for its satisfiability. An unsatisfiable formula implies that
Ψ^A is a Skolem function vector. Otherwise, a satisfying
assignment π of ε is used to improve the current refinements
of Cb1[i] and Cb0[i] for suitable variables x_i.

- Counterexample guided abstraction and refinement. 
This step is implemented in Algorithm 4 (UpdateAbsRef),
and computes an improved (i.e., more abstract)
refinement of Cb0[i] and Cb1[i] for some x_i ∈ X. This, in
turn, leads to a refinement of the abstract Skolem function
vector Ψ^A.

The overall CEGAR loop starts with the first step and repeats 
the second and third steps until a Skolem function vector is 
obtained. We now discuss the three steps in detail. 

B. Initial Abstraction and Refinement 

Algorithm InitAbsRef (see Algorithm 3) starts by initializing
each r1[i] and r0[i], viewed as sets, to the empty set.
Subsequently, it considers each factor f in ⋀_{j=1}^{r} f^j(X_j, Y_j),
and determines the contribution of f to Cb0[i] and Cb1[i], for
every x_i in the support of f. Specifically, if x_i ∈ Supp(f),
the contribution of f to Cb0[i] is (¬∃x_1 ... x_{i-1} f)[x_i/0], and


Algorithm 3: InitAbsRef

Input: Prop. formula F(X, Y) = ⋀_{j=1}^{r} f^j(X_j, Y_j),
       where X = (x_1, ..., x_n)
Output: Abstract Skolem function vector Ψ^A = (ψ_1^A, ..., ψ_n^A),
        and refinements r0[i] and r1[i] for each x_i in X

1  for i in 1 to n do
2      r0[i] := ∅; r1[i] := ∅; // Initializing
3  for j in 1 to r do
4      f := f^j; // for each factor
5      for i in 1 to n do
6          if x_i ∈ Supp(f) then
7              r0[i] := r0[i] ∪ {¬f[x_i/0]};
8              r1[i] := r1[i] ∪ {¬f[x_i/1]};
               // Skolem function for x_i in f
9              ψ_{i,f} := f[x_i/1];
10             f := f[x_i/ψ_{i,f}]; // since f[x_i/ψ_{i,f}] ≡ ∃x_i f
11 for i in 1 to n do
12     ψ_i^A := ¬r1[i]; // Interpreting r1[i] as a function
13 return Ψ^A = (ψ_1^A, ..., ψ_n^A) and r0[i], r1[i] ∀x_i ∈ X


its contribution to Cb1[i] is (¬∃x_1 ... x_{i-1} f)[x_i/1]. These
contributions are accumulated in the sets r0[i] and r1[i],
respectively, and x_i is existentially quantified from f. The
process is then repeated with the next variable in the support
of f. Once the contributions from all factors are accumulated
in r0[i] and r1[i] for each x_i in X, InitAbsRef computes
an abstract Skolem function ψ_i^A for each x_i in F by complementing
r1[i], interpreted as a disjunction of functions.
Note that executing steps 4 through 10 of InitAbsRef
for a specific factor f is operationally similar to executing
steps 1 through 7 of MonoSkolem with a singleton set
of factors, i.e., Factors = {f}. This highlights the key
difference between InitAbsRef and MonoSkolem: while
MonoSkolem works with monolithic conjunctions of factors
and their compositions, InitAbsRef works with individual
factors, without ever considering their conjunctions. Lemma 1
asserts the correctness of InitAbsRef.
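
The per-factor processing of InitAbsRef can be sketched as follows. This is a minimal illustration under stated assumptions: functions are Python closures with explicit support sets rather than a symbolic representation, the helper names (`Fn`, `cofactor`, `subst`, `neg`, `disj`, `init_abs_ref`, `equivalent`) are hypothetical, and the input is the set of three factors used in Example 1 of Subsection IV-C.

```python
from itertools import product

class Fn:
    """Boolean function as a Python closure plus an explicit support set."""
    def __init__(self, support, ev):
        self.support, self.ev = frozenset(support), ev
    def __call__(self, env):
        return self.ev(env)

def cofactor(f, var, val):
    return Fn(f.support - {var}, lambda env: f({**env, var: val}))

def subst(f, var, g):
    return Fn((f.support - {var}) | g.support,
              lambda env: f({**env, var: g(env)}))

def neg(f):
    return Fn(f.support, lambda env: not f(env))

def disj(fs):
    fs = list(fs)
    support = frozenset().union(*(f.support for f in fs))
    return Fn(support, lambda env: any(f(env) for f in fs))

def init_abs_ref(factors, xs):
    r0 = {x: [] for x in xs}                       # steps 1-2
    r1 = {x: [] for x in xs}
    for f in factors:                              # steps 3-4: one factor at a time
        for x in xs:                               # step 5
            if x in f.support:                     # step 6
                r0[x].append(neg(cofactor(f, x, False)))   # step 7
                r1[x].append(neg(cofactor(f, x, True)))    # step 8
                psi_if = cofactor(f, x, True)              # step 9
                f = subst(f, x, psi_if)                    # step 10: exists x. f
    psi_a = {x: neg(disj(r1[x])) for x in xs}      # steps 11-12
    return psi_a, r0, r1

# Factors of Example 1.
f1 = Fn({"x1", "x2", "y1"}, lambda e: not e["x1"] or not e["x2"] or not e["y1"])
f2 = Fn({"x2", "y2", "y3"}, lambda e: e["x2"] or not e["y3"] or not e["y2"])
f3 = Fn({"x1", "x2", "y3"}, lambda e: e["x1"] or not e["x2"] or e["y3"])

NAMES = ["x1", "x2", "y1", "y2", "y3"]
ENVS = [dict(zip(NAMES, bits)) for bits in product([False, True], repeat=5)]

def equivalent(f, g):
    """Semantic equivalence by enumeration over all 32 valuations."""
    return all(bool(f(e)) == bool(g(e)) for e in ENVS)

psi_a, r0, r1 = init_abs_ref([f1, f2, f3], ["x1", "x2"])
print(equivalent(psi_a["x1"], lambda e: not e["x2"] or not e["y1"]))
```

Note that the conjunction f^1 ∧ f^2 ∧ f^3 is never built: each factor only contributes to the sets r0[i] and r1[i] of the variables in its own support.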

Lemma 1. The vector Ψ^A computed by InitAbsRef is an
abstract Skolem function vector for X in F(X, Y). In addition,
r0[i] and r1[i] computed by InitAbsRef are refinements of
Cb0[x_i](F) and Cb1[x_i](F) for every x_i in X.

Proof. Consider the ordered pair (j, i) of loop indices corresponding
to the nested loops in steps 3–10 and 5–10
of algorithm InitAbsRef. Every update of r0[i] and r1[i]
in steps 7 and 8 of InitAbsRef can be associated with a
unique ordered pair of loop indices. Define a linear ordering
≺ on the loop index pairs as: (j, i) ≺ (j', i') iff j < j', or
j = j' and i < i'. Note that this represents the ordering of
loop index pairs in successive iterations of the loop in steps
5–10 of InitAbsRef. We use induction on (j, i), ordered by
≺, to show that r0[i] and r1[i], as computed by InitAbsRef,
are refinements of Cb0[i] and Cb1[i]. The base case follows
from the initialization in steps 1 and 2 of InitAbsRef. To
prove the inductive step, consider an update of r0[i] and r1[i]
in steps 7 and 8, respectively, of InitAbsRef. The function
f used in steps 7 and 8 is easily seen to be ∃x_1 ... x_{i-1} f^j.
Since f^j is a factor of F, we also have F ⇒ f^j. It follows
that ∃x_1 ... x_{i-1} F ⇒ ∃x_1 ... x_{i-1} f^j ≡ f. Taking the
contrapositive gives ¬f ⇒ ¬∃x_1 ... x_{i-1} F. Therefore,
¬f[x_i/a] ⇒ (¬∃x_1 ... x_{i-1} F)[x_i/a] for every propositional
constant a. Recalling the definitions of Cb0[i] and Cb1[i],
we get ¬f[x_i/0] ⇒ Cb0[i] and ¬f[x_i/1] ⇒ Cb1[i]. By
the inductive hypothesis, r0[i] and r1[i] are refinements of
Cb0[i] and Cb1[i] prior to executing step 7 of InitAbsRef.
Therefore, the updated values of r0[i] and r1[i], as computed
in steps 7 and 8 of InitAbsRef, are also refinements of Cb0[i]
and Cb1[i]. This completes the induction.

Since r1[i] ⇒ Cb1[i] for every x_i in X when we reach step
11 of InitAbsRef, it follows from Proposition 2 that ψ_i^A =
¬r1[i] abstracts a Skolem function for x_i in F. Hence, Ψ^A,
as computed by InitAbsRef, is an abstract Skolem function
vector for X in F. □

C. Termination condition 

Given F(X, Y) and an abstract Skolem function vector Ψ^A,
it may happen that Ψ^A is already a Skolem function vector
for X in F. We therefore check if Ψ^A is a Skolem function
vector before refinement. Towards this end, we define the error
formula for Ψ^A as F(X', Y) ∧ ⋀_{i=1}^{n} (x_i ⇔ ψ_i^A) ∧ ¬F(X, Y),
where X' = (x'_1, ..., x'_n) is a sequence of fresh variables with
no variable in common with X. The first term in the error
formula checks if there exists some valuation of X that renders
∃Y F(X, Y) true. The second term assigns variables in X to
the values given by the abstract Skolem functions, and the
third term checks if this assignment falsifies the formula F.

Lemma 2. The error formula for Ψ^A is unsatisfiable iff Ψ^A
is a Skolem function vector of X in F.

Proof. Let ε be the error formula for Ψ^A. Suppose ε is
unsatisfiable. By definition of ε, we have

∀Y ∀X' ∀X [ F(X', Y) ⇒ ( ⋀_{i=1}^{n} (x_i ⇔ ψ_i^A) ⇒ F(X, Y) ) ].

By standard logic transformations, this implies
∀Y (∃X' F(X', Y) ⇒ F'(Y)), where F'(Y) denotes
(···(F[x_1/ψ_1^A])···[x_n/ψ_n^A]). Therefore, Ψ^A is a Skolem
function vector for X in F.

Suppose π is a satisfying assignment of ε. By definition
of ε, π is a satisfying assignment of F(X', Y) and of
⋀_{i=1}^{n} (x_i ⇔ ψ_i^A) ∧ ¬F(X, Y), considered separately. Thus,
the values of x_1, ..., x_n given by ψ_1^A, ..., ψ_n^A respectively,
cause F to evaluate to 0 for the valuation of Y in π. However,
there exists a valuation of X (viz. same as that of X' in π)
that causes F to evaluate to 1 for the same valuation of Y in
π. Hence, Ψ^A is not a Skolem function vector for X in F,
as witnessed by the valuation of Y in π. □

The following example illustrates the role of the error 
formula. 

Example 1. Let X = {x_1, x_2}, Y = {y_1, y_2, y_3} in
∃x_1 x_2 F(X, Y) where F ≡ (f^1 ∧ f^2 ∧ f^3), with f^1 =
(¬x_1 ∨ ¬x_2 ∨ ¬y_1), f^2 = (x_2 ∨ ¬y_3 ∨ ¬y_2), f^3 = (x_1 ∨ ¬x_2 ∨ y_3).

Algorithm InitAbsRef gives r1[1] = (x_2 ∧ y_1), r0[1] =
(x_2 ∧ ¬y_3), r1[2] = false, r0[2] = y_3 ∧ y_2. This yields ψ_1^A =
(¬x_2 ∨ ¬y_1), ψ_2^A = true. Now, while ψ_1^A is a correct Skolem
function for x_1 in F, ψ_2^A is not for x_2. This is detected by the
satisfiability of the error formula ε = F(x'_1, x'_2, Y) ∧ (x_1 ⇔
(¬x_2 ∨ ¬y_1)) ∧ (x_2 ⇔ 1) ∧ ¬F(x_1, x_2, Y). Note that ¬F(¬x_2 ∨
¬y_1, 1, Y) simplifies to (y_1 ∧ ¬y_3), and y_1 = 1, y_2 = 1, y_3 =
0, x_1 = 0, x_2 = 1, x'_1 = 0, x'_2 = 0 is a satisfying assignment
for ε.
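
The error formula of Example 1 is small enough to be checked by enumeration. The Python sketch below is an illustration only: it replaces the SAT solver call by a brute-force search over all 2^7 valuations, and the function names are hypothetical. It lists every satisfying assignment of ε and the valuations of Y that witness the failure of Ψ^A.

```python
from itertools import product

B = (False, True)

def F(x1, x2, y1, y2, y3):
    f1 = (not x1) or (not x2) or (not y1)
    f2 = x2 or (not y3) or (not y2)
    f3 = x1 or (not x2) or y3
    return f1 and f2 and f3

def psi1_a(x2, y1):
    # Abstract Skolem function for x1 from Example 1: NOT x2 OR NOT y1.
    return (not x2) or (not y1)

def psi2_a():
    # Abstract Skolem function for x2 from Example 1: the constant true.
    return True

def error_formula(x1p, x2p, x1, x2, y1, y2, y3):
    # F(X', Y) AND (x1 <=> psi1_a) AND (x2 <=> psi2_a) AND NOT F(X, Y)
    return (F(x1p, x2p, y1, y2, y3)
            and x1 == psi1_a(x2, y1)
            and x2 == psi2_a()
            and not F(x1, x2, y1, y2, y3))

# Each model is a tuple (x1', x2', x1, x2, y1, y2, y3).
models = [m for m in product(B, repeat=7) if error_formula(*m)]
bad_ys = sorted({m[4:] for m in models})   # valuations of (y1, y2, y3)
print(len(models), bad_ys)
```

Every model has y_1 = 1 and y_3 = 0, in agreement with the simplification of ¬F(¬x_2 ∨ ¬y_1, 1, Y) to (y_1 ∧ ¬y_3), and the assignment quoted in Example 1 is among the models found.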

D. Counterexample-guided abstraction and refinement 

Let ε be the error formula for Ψ^A, and let π be a satisfying
assignment of ε. We call π a counterexample of the claim that
Ψ^A is a Skolem function vector. For every variable v ∈ X' ∪
X ∪ Y, we use π(v) to denote the value of v in π. Satisfiability
of ε implies that we need to refine at least one abstract Skolem
function ψ_i^A in Ψ^A to make it a Skolem function vector. Since
ψ_i^A is ¬r1[i] in our approach, refining ψ_i^A can be achieved by
computing an improved (i.e., more abstract) version of r1[i].
Algorithm UpdateAbsRef implements this idea by using π
to determine which r1[i] should be rendered more abstract by adding
appropriate functions to r1[i], viewed as a set.

Before delving into the details of UpdateAbsRef, we
state some key results. In the following, we use π ⊨ f to
denote that the formula f evaluates to 1 when the variables
in Supp(f) are set to values given by π. If π ⊨ f, we
also say f evaluates to 1 under π. We use r0[i]_init and
r1[i]_init to refer to r0[i] and r1[i], as computed by algorithm
InitAbsRef. Since UpdateAbsRef only adds to r1[i] and
r0[i] viewed as sets, it is easy to see that r0[i]_init ⇒ r0[i] and
r1[i]_init ⇒ r1[i] viewed as functions (recall these functions
are simply disjunctions of elements in the corresponding sets).

Lemma 3. Let π be a satisfying assignment of the error
formula ε for Ψ^A. Then the following hold.

(a) π ⊨ ¬Cb0[n] ∨ ¬Cb1[n].

(b) There exists k ∈ {1, ..., n − 1} such that π ⊨ r1[k] ∧ r0[k].

(c) There exists no Skolem function vector Ψ = (ψ_1, ..., ψ_n)
such that ψ_j ⇔ ψ_j^A for all j in {k + 1, ..., n}.

(d) There exists l ∈ {k + 1, ..., n} such that x_l = 1 in π,
and π ⊨ Cb1[l] ∧ ¬r0[l].

Proof. Part (a): Consider an assignment π' of variables in
X ∪ Y, such that π'(x_i) = π(x'_i) for all x_i ∈ X, and
π'(y_j) = π(y_j) for all y_j ∈ Y. Since π ⊨ ε, by definition of
ε, we have π ⊨ F(X', Y). This implies that π' ⊨ F(X, Y)
and hence, π' ⊨ ∃x_1 ... x_{n-1} F. If x_n = 1 in π', we get
π' ⊨ (∃x_1 ... x_{n-1} F)[x_n/1], or equivalently, π' ⊨ ¬Cb1[n].
If x_n = 0 in π', by a similar argument, π' ⊨ ¬Cb0[n].
Therefore, π' ⊨ ¬Cb1[n] ∨ ¬Cb0[n]. Since x_n is the variable
with the highest index in X, both Cb1[n] and Cb0[n] have
only Y as their support. Since π'(y_j) = π(y_j) for all y_j ∈ Y,
it follows that π ⊨ ¬Cb1[n] ∨ ¬Cb0[n] as well.

Part (b): Since π ⊨ ε, by definition of ε, we have π ⊨
¬F(X, Y). Since F ≡ ⋀_{q=1}^{r} f^q, there exists j ∈ {1, ..., r}
such that π ⊨ ¬f^j. Without loss of generality, assume that
Supp(f^j) ∩ X ≠ ∅ (otherwise, f^j can be removed from ⋀_{q=1}^{r} f^q).
Let x_k be the variable with the smallest index in Supp(f^j).
We claim that x_k = 0 in π, and prove this by contradiction.

If possible, let x_k = 1 in π. Then, π ⊨ (¬f^j)[x_k/1]. Since
x_k is the lowest indexed variable in Supp(f^j), it follows from
algorithm InitAbsRef that (¬f^j)[x_k/1] ∈ r1[k]_init when
r1[k]_init is viewed as a set. This implies that (¬f^j)[x_k/1] ⇒
r1[k]_init, when r1[k]_init is viewed as a function. Hence, π ⊨
r1[k]_init, and since r1[k]_init ⇒ r1[k], we have π ⊨ r1[k]. By
definition of ε, we also have π ⊨ (x_k ⇔ ψ_k^A), where ψ_k^A =
¬r1[k]. It follows that x_k = ψ_k^A = 0 in π. This contradicts
our assumption (x_k = 1), and hence x_k must be 0 in π.

Since x_k = 0 in π, following the same reasoning as above,
we can show that π ⊨ r0[k]. Furthermore, since π ⊨ (x_k ⇔
ψ_k^A) and ψ_k^A = ¬r1[k], having x_k = 0 in π implies that π ⊨
r1[k]. Hence, π ⊨ r0[k] ∧ r1[k]. Since r1[k] ⇒ Cb1[k] and
r0[k] ⇒ Cb0[k], we have π ⊨ Cb0[k] ∧ Cb1[k] as well. It now
follows from part (a) that k ≠ n and hence k ∈ {1, ..., n − 1}.

Part (c): We prove this by contradiction. If possible, let
there be a Skolem function vector Ψ such that ψ_i ⇔ ψ_i^A
for all i in {k + 1, ..., n}. Since π ⊨ F(X', Y), it follows
that π ⊨ ∃x_1 ... x_n F. Therefore, by definition of
Skolem functions, π ⊨ (···(F[x_1/ψ_1])···[x_n/ψ_n]). Since
we have assumed ψ_i ⇔ ψ_i^A for all i in {k + 1, ..., n}
and since π ⊨ ⋀_{i=1}^{n} (x_i ⇔ ψ_i^A), it follows that π ⊨
(···(F[x_1/ψ_1])···[x_k/ψ_k]). However, we know from part
(b) that π ⊨ r0[k] ∧ r1[k] and hence π ⊨ Cb0[k] ∧ Cb1[k].
Recalling the definitions of Cb0[k] and Cb1[k], we get π ⊨
(¬∃x_1 ... x_k F). This contradicts our inference above, i.e.,
π ⊨ (···(F[x_1/ψ_1])···[x_k/ψ_k]). Hence our assumption is
wrong, i.e., there is no Skolem function vector Ψ such that
ψ_i ⇔ ψ_i^A for all i in {k + 1, ..., n}.

Part (d): We prove this by contradiction. If possible, suppose 
xi = 0 in 7T, or 7r )= —iCb 1 [i] VrO[Z] for all( £ {k + 1,..., n}. 
For convenience of notation, let us call this assumption A in 
the discussion below. 

If $x_l = 0$ in $\pi$, then since $\pi \models \bigwedge_{i=1}^{n} (x_i \Leftrightarrow \psi_i^A)$ and $\psi_i^A = \neg r1[i]$ for all $i \in \{1, \ldots, n\}$, it follows that $\pi \models r1[l]$. Since $r1[l] \Rightarrow Cb1[l]$, we have $\pi \models Cb1[l]$ as well. It is also easy to see that whenever $\pi \models \neg Cb1[l]$, then $\pi \models \neg r1[l]$ as well. Therefore, if $x_l = 0$ in $\pi$ or if $\pi \models \neg Cb1[l]$, then both $Cb1[l]$ and $r1[l]$ evaluate to the same value under $\pi$.

Consider the subcase of assumption A where $x_l = 0$ in $\pi$, or $\pi \models \neg Cb1[l]$, for all $l \in \{k+1, \ldots, n\}$. From the discussion above, either $\pi \models Cb1[l] \wedge r1[l]$ or $\pi \models \neg Cb1[l] \wedge \neg r1[l]$ for all $l \in \{k+1, \ldots, n\}$. Now consider the Skolem function vector $\Psi$ given by Proposition 2. Since $\psi_l = \neg Cb1[l]$ and $\psi_l^A = \neg r1[l]$, it follows that there exists a Skolem function vector, viz. $\Psi$, such that $\psi_l \Leftrightarrow \psi_l^A$ for all $l$ in $\{k+1, \ldots, n\}$. This


Algorithm 4: UpdateAbsRef

Input:  $r0[i]$ and $r1[i]$ for all $x_i$ in $X$,
        satisfying assignment $\pi$ of the error formula, i.e., of
        $F(X', Y) \wedge \bigwedge_{i=1}^{n} (x_i \Leftrightarrow \psi_i^A) \wedge \neg F(X, Y)$
Output: Improved (i.e., refined) $\Psi^A = (\psi_1^A, \ldots, \psi_n^A)$,
        improved (i.e., abstracted) $r0[i]$ and $r1[i]$, $\forall x_i \in X$

 1  $k$ := largest $m$ such that $\pi$ satisfies $r0[m] \wedge r1[m]$;
 2  $\mu_0$ := Generalize($\pi$, $r0[k]$);
 3  $\mu_1$ := Generalize($\pi$, $r1[k]$);
 4  $\mu$ := $\mu_0 \wedge \mu_1$;
    // Search for Skolem function among $\{\psi_{k+1}^A, \ldots, \psi_n^A\}$ to be refined
 5  $l$ := $k + 1$;
 6  while true do                      // current guess: refine $\psi_l^A$
 7      if $x_l \in Supp(\mu)$ then
 8          if $x_l = 1$ in $\pi$ then
 9              $\mu_1$ := $\mu[x_l/1]$;
10              $r1[l]$ := $r1[l] \cup \{\mu_1\}$;
11              if $\pi$ satisfies $r0[l]$ then
12                  $\mu_0$ := Generalize($\pi$, $r0[l]$);
13                  $\mu$ := $\mu_0 \wedge \mu_1$;
14              else
15                  break;
16          else
17              $\mu_0$ := $\mu[x_l/0]$;
18              $r0[l]$ := $r0[l] \cup \{\mu_0\}$;
19              $\mu_1$ := Generalize($\pi$, $r1[l]$);
20              $\mu$ := $\mu_0 \wedge \mu_1$;
21      $l$ := $l + 1$
22  $\Psi^A$ := $(\neg r1[1], \ldots, \neg r1[n])$;
23  return $r0[i]$ and $r1[i]$ for all $x_i$ in $X$, and $\Psi^A$



contradicts the assertion in part (c) above. Hence we cannot have $x_l = 0$ in $\pi$ or $\pi \models \neg Cb1[l]$, for all $l \in \{k+1, \ldots, n\}$.

If assumption A has to hold, there must therefore exist some $l \in \{k+1, \ldots, n\}$ such that $x_l = 1$ in $\pi$ and $\pi \models Cb1[l] \wedge r0[l]$. Since $r0[l] \Rightarrow Cb0[l]$, we must have $\pi \models Cb1[l] \wedge Cb0[l]$ in this case. From part (a), we know that $\pi \models \neg Cb0[n] \vee \neg Cb1[n]$. It follows that $l$ is strictly less than $n$, and we can repeat the entire argument above with assumption A restricted to indices in $\{l+1, \ldots, n\}$. Note that $\{l+1, \ldots, n\}$ is non-empty (since $l < n$), and is a strict subset of $\{k+1, \ldots, n\}$ (since $l \in \{k+1, \ldots, n\}$). Therefore, restricting assumption A to smaller subsets of indices can only be done finitely many times, after which there will not be any $l$ in the set of indices under consideration such that $x_l = 1$ in $\pi$ and $\pi \models Cb1[l] \wedge r0[l]$. This shows that assumption A is false, thereby proving the assertion in part (d). □

Algorithm 4 (UpdateAbsRef) uses Lemma 3 to compute abstract versions of $r0[i]$ and $r1[i]$, and a refined version of $\Psi^A$, when $\Psi^A$ is not a Skolem function vector. It takes as input the current versions of $r0[i]$ and $r1[i]$ for all $x_i$ in $X$, and a satisfying assignment $\pi$ of the error formula for the current version of $\Psi^A$. Since $\pi \models F(X', Y)$ and $\pi \models \neg F(X, Y)$, and since the value of every $x_i$ in $\pi$ is given by $\psi_i^A$, there exists at least one $\psi_l^A$, for $l \in \{1, \ldots, n\}$, that fails to generate the right value of $x_l$ when the value of $Y$ is as given by $\pi$. UpdateAbsRef works by identifying such an index $l$ and refining $\psi_l^A$. Since $\psi_l^A = \neg r1[l]$, $\psi_l^A$ is refined by updating (abstracting) the corresponding $r1[l]$ set. In fact, the algorithm may, in general, end up abstracting not only $r1[l]$, but several $r0[i]$ and $r1[i]$ as well, in a sound manner.

As shown in Algorithm 4, UpdateAbsRef first finds the largest index $k$ such that $\pi \models r0[k] \wedge r1[k]$. Lemma 3(b) guarantees the existence of such an index in $\{1, \ldots, n\}$. We assume access to a function called Generalize that takes as arguments an assignment $\pi$ and a function $\varphi$ such that $\pi \models \varphi$, and returns a function $\xi$ that generalizes $\pi$ while satisfying $\varphi$. More formally, if $\xi$ = Generalize($\pi$, $\varphi$), then $Supp(\xi) \subseteq Supp(\varphi)$, $\pi \models \xi$ and $\xi \Rightarrow \varphi$ (details of Generalize used in our implementation are discussed later). Thus, in steps 2 and 3 of UpdateAbsRef, we compute generalizations of $\pi$ that satisfy $r0[k]$ and $r1[k]$, respectively. The function $\mu$ computed in step 4 is therefore such that $\pi \models \mu$ and $\mu \Rightarrow r0[k] \wedge r1[k]$. Since $r0[k] \wedge r1[k] \Rightarrow \neg \exists x_1 \ldots x_k\, F$, any abstract Skolem function vector that produces values of $x_{k+1}, \ldots, x_n$ (given the valuation of $Y$ as in $\pi$) for which $\mu$ evaluates to 1 cannot be a Skolem function vector. Since the support of $\mu$ is contained in $\{x_{k+1}, \ldots, x_n\} \cup Y$, one of the abstract Skolem functions $\psi_{k+1}^A, \ldots, \psi_n^A$ must be refined.

The loop in steps 6-21 of UpdateAbsRef tries to identify an abstract Skolem function $\psi_l^A$ to be refined, by iterating $l$ from $k+1$ to $n$. Clearly, if $x_l \notin Supp(\mu)$, the value of $\psi_l^A$ under $\pi$ is of no consequence in evaluating $\mu$, and we ignore such variables. If $x_l \in Supp(\mu)$ and if $x_l = 1$ in $\pi$, then $\pi \models \mu[x_l/1]$ and $\mu[x_l/1] \Rightarrow (\neg \exists x_1 \cdots x_{l-1}\, F)[x_l/1]$. Recalling the definition of $Cb1[l]$, we have $\mu[x_l/1] \Rightarrow Cb1[l]$, and therefore $\mu[x_l/1]$ can be added to $r1[l]$ (viewed as a set), yielding a more abstract version of $r1[l]$. Steps 8-10 of UpdateAbsRef implement this update of $r1[l]$. Note that since $\pi \models \mu[x_l/1]$, we have $\pi \models r1[l]$ after step 10. If it so happens that $\pi \models r0[l]$ as well, then we have $\pi \models r0[l] \wedge r1[l]$, where $r1[l]$ refers to the updated refinement of $Cb1[l]$. In this case, we have effectively found an index $l > k$ such that $\pi \models r0[l] \wedge r1[l]$. We can therefore repeat our algorithm starting with $l$ instead of $k$. Steps 11-13 followed by step 21 of algorithm UpdateAbsRef effectively implement this. If, on the other hand, $\pi \not\models r0[l]$, then we have found an $l$ that satisfies the conditions in Lemma 3(d). We exit the search for an abstract Skolem function in this case (see steps 14-15).

If $x_l = 0$ in $\pi$, a similar argument as above shows that $\mu[x_l/0]$ can be added to $r0[l]$. Steps 17-18 of UpdateAbsRef implement this update. As before, it is easy to see that $\pi \models r0[l]$ after step 18. Moreover, since $\pi \models (x_l \Leftrightarrow \psi_l^A)$ and $\psi_l^A = \neg r1[l]$, in order to have $x_l = 0$ in $\pi$, we must have $\pi \models r1[l]$. Therefore, we have once again found an index $l > k$ such that $\pi \models r0[l] \wedge r1[l]$, and can repeat our algorithm starting with $l$ instead of $k$. Steps 19-21 of algorithm UpdateAbsRef effectively implement this.

Once we exit the loop in steps 6-21 of UpdateAbsRef, we compute the refined Skolem function vector $\Psi^A$ as $(\neg r1[1], \ldots, \neg r1[n])$ in step 22 and return the updated $r0[i]$, $r1[i]$ for all $x_i$ in $X$, and also $\Psi^A$.
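
The update step described above can be sketched in a few lines of Python. This is only an illustrative sketch under strong simplifying assumptions that are not part of the paper: every element of r0[i] and r1[i] is a cube (a conjunction of literals stored as a dict), Generalize picks a satisfied cube with the smallest support, and the names `sat`, `sat_any`, `generalize`, `drop` and `update_abs_ref` are hypothetical. The toy data at the end is likewise assumed, chosen to be consistent with the assignment quoted in Example 1.

```python
from typing import Dict, List

Cube = Dict[str, bool]  # conjunction of literals: variable name -> required value


def sat(pi: Cube, cube: Cube) -> bool:
    """True if the full assignment pi satisfies every literal of cube."""
    return all(pi[v] == b for v, b in cube.items())


def sat_any(pi: Cube, cubes: List[Cube]) -> bool:
    """r0[i] and r1[i] are lists of cubes, read as a disjunction."""
    return any(sat(pi, c) for c in cubes)


def generalize(pi: Cube, cubes: List[Cube]) -> Cube:
    """Return one cube of the set that pi satisfies (smallest support first)."""
    return min((c for c in cubes if sat(pi, c)), key=len)


def drop(cube: Cube, var: str) -> Cube:
    """Cofactor of a cube satisfied by pi w.r.t. pi's value of var: remove that literal."""
    return {v: b for v, b in cube.items() if v != var}


def update_abs_ref(r0, r1, pi, n):
    k = max(m for m in range(1, n + 1)
            if sat_any(pi, r0[m]) and sat_any(pi, r1[m]))      # step 1
    mu = {**generalize(pi, r0[k]), **generalize(pi, r1[k])}     # steps 2-4
    l = k + 1                                                   # step 5
    while l <= n:  # the paper loops "while true"; the bound is only a safety net
        x = f"x{l}"
        if x in mu:                                             # step 7
            if pi[x]:                                           # step 8
                mu1 = drop(mu, x)                               # step 9
                r1[l].append(mu1)                               # step 10
                if not sat_any(pi, r0[l]):
                    break                                       # steps 14-15
                mu = {**generalize(pi, r0[l]), **mu1}           # steps 12-13
            else:
                mu0 = drop(mu, x)                               # step 17
                r0[l].append(mu0)                               # step 18
                mu = {**mu0, **generalize(pi, r1[l])}           # steps 19-20
        l += 1                                                  # step 21
    return r0, r1


# Assumed toy data: n = 2, with r1[2] empty so that psi_2^A = not r1[2] is "true".
r0 = {1: [{"x2": True, "y3": False}], 2: [{"y2": True, "y3": True}]}
r1 = {1: [{"x2": True, "y1": True}], 2: []}
pi = {"y1": True, "y2": True, "y3": False, "x1": False, "x2": True}
update_abs_ref(r0, r1, pi, n=2)
```

On this toy input the call adds the cube y1 ∧ ¬y3 to r1[2] and leaves every other set untouched, so the abstract Skolem function for x2, read as the negation of r1[2], becomes ¬y1 ∨ y3.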

Lemma 4. Algorithm UpdateAbsRef always terminates, and renders at least one $r1[i]$ strictly more abstract, and at least one $\psi_i^A$ strictly refined, for $i \in \{1, \ldots, n\}$.

Proof. By Lemma 3(a), we know that $\pi \models \neg Cb0[n] \vee \neg Cb1[n]$, and therefore $\pi \models \neg r0[n] \vee \neg r1[n]$. Since steps 12-13 or 17-20 of UpdateAbsRef can be executed only when $\pi \models r0[l] \wedge r1[l]$, and since $l$ is incremented in every iteration of the loop in steps 6-21, it follows that steps 14-15 must be executed for some $l \leq n$. Therefore, algorithm UpdateAbsRef always terminates.

It is easy to see from the pseudocode of algorithm UpdateAbsRef that steps 7-10 and 14-15 must be executed before exiting the while loop (steps 6-21) and terminating. Before executing step 10, we have $x_l = 1$ in $\pi$ and $\pi \models \bigwedge_{i=1}^{n} (x_i \Leftrightarrow \psi_i^A)$. Since $\psi_l^A = \neg r1[l]$ before step 10, with $x_l = 1$ in $\pi$, it must be the case that $\pi \models \neg r1[l]$ before step 10. However, since $\pi \models \mu[x_l/1]$ in step 9, we have $\pi \models r1[l]$ after step 10. Therefore, executing step 10 renders $r1[l]$ strictly more abstract than what it was earlier. This also implies that $\psi_l^A = \neg r1[l]$ is strictly refined when UpdateAbsRef returns in step 23. □

Example 1 (Continued). Continuing with our earlier example, the error formula after the first step has a satisfying assignment $y_1 = 1$, $y_2 = 1$, $y_3 = 0$, $x_1 = 0$, $x_2 = 1$, $x_1' = 0$, $x_2' = 0$.
Using this for $\pi$ in UpdateAbsRef, we find that $\psi_1^A$ is left unchanged at $(\neg x_2 \vee \cdots)$, while $\psi_2^A$, which was true earlier, is refined to $(\neg y_1 \vee y_3)$. With these refined Skolem functions, $F(\Psi^A, Y)$ evaluates to true for all valuations of $Y$. As a result, the (new) error formula becomes unsatisfiable, confirming the correctness of the Skolem functions.

The overall CegarSkolem algorithm can now be implemented as depicted in Algorithm 5. From the above discussion and the preceding lemmas, we obtain our main result.

Theorem 1. CegarSkolem($F(X, Y)$) terminates and computes a Skolem function vector for $X$ in $F$.

Proof. By Lemma 4, we know that every invocation of UpdateAbsRef renders at least one $r1[i]$ strictly more abstract than what it was earlier. Since $r1[i]$ is a propositional function, it has finitely many minterms and can be rendered strictly more abstract only finitely many times. From Proposition 2 we also know that $(\neg Cb1[1], \ldots, \neg Cb1[n])$ is indeed a Skolem function vector, and therefore by Lemma 2 its error formula is unsatisfiable. The termination of CegarSkolem follows immediately from the above observations. Since $\varepsilon$ is unsatisfiable when CegarSkolem terminates, it follows from Lemma 2 that the vector of functions returned is a Skolem function vector for $X$ in $F$. □


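Algorithm 5: CegarSkolem

Input:  Propositional formula $F(X, Y)$ given as a conjunction of factors, $X = (x_1, \ldots, x_n)$
Output: Skolem function vector $\Psi(F)$ for $X$ in $F$

1  $(\Psi^A, \{r0[i], r1[i] : 1 \leq i \leq n\})$ := InitAbsRef(factors of $F$);
2  $\varepsilon$ := $F(X', Y) \wedge \bigwedge_{i=1}^{n} (x_i \Leftrightarrow \psi_i^A) \wedge \neg F(X, Y)$;
3  while $\varepsilon$ is satisfiable do
4      Let $\pi$ be a satisfying assignment of $\varepsilon$;
5      $(\Psi^A, \{r0[i], r1[i] : 1 \leq i \leq n\})$ := UpdateAbsRef($\{r0[i], r1[i] : 1 \leq i \leq n\}$, $\pi$);
6      $\varepsilon$ := $F(X', Y) \wedge \bigwedge_{i=1}^{n} (x_i \Leftrightarrow \psi_i^A) \wedge \neg F(X, Y)$;
7  $\Psi(F)$ := ReverseSubstitute($\neg r1[1], \ldots, \neg r1[n]$);
8  return $\Psi(F)$;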
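
The satisfiability check on the error formula in steps 3-4 can be illustrated with a brute-force stand-in for the SAT solver. The sketch below is not the authors' implementation: the three-factor formula `F`, the candidate functions in `initial` and `refined`, and the helper name `find_counterexample` are all assumptions made for illustration, chosen to be consistent with the assignment quoted in Example 1. Each candidate for x_i may read the values of later x variables and Y, so values are computed from x_n downwards.

```python
from itertools import product

BOOLS = (False, True)


def F(x1, x2, y1, y2, y3):
    """Assumed three-factor toy formula (illustration only)."""
    return ((not x1 or not x2 or not y1)
            and (x2 or not y3 or not y2)
            and (x1 or not x2 or y3))


def find_counterexample(F, psi, n_x, n_y):
    """Return (X, X', Y) satisfying the error formula, or None if none exists.

    psi[i] maps (values of later x variables, Y) to the value of x_{i+1}.
    """
    for Y in product(BOOLS, repeat=n_y):
        X = [False] * n_x
        for i in reversed(range(n_x)):        # x_n first, then x_{n-1}, ...
            X[i] = psi[i](X[i + 1:], Y)
        if F(*X, *Y):
            continue                           # candidate vector works for this Y
        for X_prime in product(BOOLS, repeat=n_x):
            if F(*X_prime, *Y):                # F is satisfiable for this Y
                return tuple(X), X_prime, Y
    return None


initial = [lambda later, Y: (not later[0]) or (not Y[0]),  # assumed: not x2 or not y1
           lambda later, Y: True]                           # "true", as in Example 1
refined = [initial[0],
           lambda later, Y: (not Y[0]) or Y[2]]             # not y1 or y3

ce_initial = find_counterexample(F, initial, 2, 3)
ce_refined = find_counterexample(F, refined, 2, 3)
```

With the initial candidates a counterexample is found, while the refined candidates admit none, which mirrors the loop exit condition of Algorithm 5. Enumeration is exponential in the number of variables, so this only makes sense for very small instances.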


The function Generalize($\pi$, $\varphi$) used in UpdateAbsRef can be implemented in several ways. Since $\pi \models \varphi$, we may return a conjunction of literals corresponding to the assignment $\pi$, or the function $\varphi$ itself. From our experiments, it appears that the first option leads to low memory requirements and increased run-time (due to a large number of invocations of UpdateAbsRef). The other option requires more memory and less run-time, due to fewer invocations of UpdateAbsRef. For our study, we let Generalize($\pi$, $r1[k]$) return one element in $r1[k]$ (viewed as a set) amongst all those that evaluate to 1 under $\pi$, such that the support of $\mu$ computed in Algorithm UpdateAbsRef is minimized (we had to allow Generalize($\pi$, $\cdot$) access to $\mu$ for this purpose). We follow a similar strategy for Generalize($\pi$, $r0[k]$). This gives us a reasonable tradeoff between time and space requirements.
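
Two of the Generalize strategies discussed above can be sketched as follows. This is a hedged illustration, not the implementation used in the experiments: functions are again assumed to be sets of cubes stored as dicts, and the names `generalize_cube` and `generalize_pick` as well as the `mu_support` parameter are hypothetical. The first function returns the literals of the assignment restricted to a given support; the second picks a satisfied element that adds the fewest new variables to the support of the function being built.

```python
from typing import Dict, Iterable, List, Set

Cube = Dict[str, bool]


def generalize_cube(pi: Cube, support: Iterable[str]) -> Cube:
    """Option 1: the conjunction of literals of pi, restricted to the given support."""
    return {v: pi[v] for v in support}


def generalize_pick(pi: Cube, cubes: List[Cube], mu_support: Set[str]) -> Cube:
    """Pick a cube satisfied by pi that adds the fewest new variables to mu_support."""
    satisfied = [c for c in cubes if all(pi[v] == b for v, b in c.items())]
    return min(satisfied, key=lambda c: len(set(c) - mu_support))


# Assumed example data.
pi = {"x2": True, "y1": True, "y2": True, "y3": False}
r1_k = [{"x2": True, "y1": True, "y2": True}, {"y1": True}, {"y3": True}]

small = generalize_cube(pi, ["x2", "y1"])
picked = generalize_pick(pi, r1_k, mu_support={"x2", "y3"})
```

Here `picked` is the single-literal cube on y1: the cube on y3 is not satisfied by the assignment, and the three-literal cube would add two variables to the support instead of one.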

V. Experimental Results 
A. Experimental Methodology 

We compared CegarSkolem with (a) MonoSkolem (the algorithm based on the cofactoring approach from earlier work) and with (b) Bloqqer (a QRAT-based Skolem function generation tool reported in earlier work). Bloqqer generates Skolem functions by first generating QRAT proofs using a remarkably efficient (albeit incomplete) preprocessor, and then extracting Skolem functions from these proofs.

The Skolem function generation benchmarks were obtained by considering sequential circuits from the HWMCC10 benchmark suite, and by reducing the problem of disjunctively decomposing a circuit into components to the problem of generating Skolem function vectors. Details of how these benchmarks were generated are described in a separate reference. Each benchmark is of the form $\exists X\, F(X, Y)$, where $F(X, Y)$ is a conjunction of factors and $\exists Y (\exists X\, F(X, Y))$ is true. However, for some benchmarks, $\forall Y (\exists X\, F(X, Y))$ does not evaluate to true. Since Bloqqer can generate Skolem functions only when $\forall Y (\exists X\, F(X, Y))$ is true, we divided the benchmarks into two categories: (a) TYPE-1, where $\forall Y \exists X\, F(X, Y)$ is true, and (b) TYPE-2, where $\forall Y \exists X\, F(X, Y)$ is false (although $\exists Y \exists X\, F(X, Y)$ is true). While we ran CegarSkolem and MonoSkolem on all benchmarks, we ran Bloqqer only on TYPE-1 benchmarks. Further, since Bloqqer required the input to be in qdimacs format, we converted each TYPE-1 benchmark into qdimacs format using Tseitin encoding. All our benchmarks are available for download.
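
The TYPE-1 / TYPE-2 distinction can be made concrete with a small brute-force classifier. This is only an illustrative sketch: the function name `classify` and the two tiny example formulas are assumptions, and exhaustive enumeration is feasible only for a handful of variables, unlike the benchmarks described above.

```python
from itertools import product

BOOLS = (False, True)


def classify(F, n_x, n_y):
    """Return "TYPE-1" if forall Y exists X. F holds, "TYPE-2" if it fails
    but exists Y exists X. F holds, and None if F is unsatisfiable."""
    def exists_x(Y):
        return any(F(*X, *Y) for X in product(BOOLS, repeat=n_x))

    realizable = [exists_x(Y) for Y in product(BOOLS, repeat=n_y)]
    if all(realizable):
        return "TYPE-1"
    if any(realizable):
        return "TYPE-2"
    return None


kind_eq = classify(lambda x, y: x == y, 1, 1)     # x can always copy y
kind_and = classify(lambda x, y: x and y, 1, 1)   # no x works when y is false
```

The first formula is classified as TYPE-1 and the second as TYPE-2, since the second has no satisfying value of x when y is false.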

Our implementations of MonoSkolem and CegarSkolem make use of the ABC library to represent and manipulate functions as AIGs. For CegarSkolem, we used the default SAT solver provided by ABC, which is a variant of MiniSAT. We used a simple heuristic to order the variables, and used the same ordering for both MonoSkolem and CegarSkolem. In our ordering, variables that occur in fewer factors are indexed lower than those that occur in more factors.
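
The ordering heuristic can be sketched as a sort on occurrence counts. This is a minimal sketch under assumptions: each factor is represented only by its support (a set of variable names), the function name `order_variables` is hypothetical, and ties are broken by variable name, a choice the text does not specify.

```python
from collections import Counter
from typing import Iterable, List, Set


def order_variables(factor_supports: Iterable[Set[str]], quantified: List[str]) -> List[str]:
    """Order quantified variables so that those occurring in fewer factors come first."""
    occurrences = Counter(
        v for support in factor_supports for v in support if v in quantified
    )
    # Ties are broken by name (an assumption made here for determinism).
    return sorted(quantified, key=lambda v: (occurrences[v], v))


# Assumed example: a occurs in 2 factors, b in 3, c in 2.
supports = [{"a", "b"}, {"b", "c"}, {"b"}, {"c", "a"}]
ordering = order_variables(supports, ["a", "b", "c"])
```

For the example supports the resulting order is a, c, b, so the variable shared by the most factors receives the highest index.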

We used the following metrics to compare the performance of the algorithms: (i) average/maximum size of the generated Skolem functions in a Skolem function vector, where the size is the number of nodes in the AIG representation of a function, and (ii) total time taken to generate the Skolem function vector (excluding any input format conversion time). The experiments were performed on a 1.87 GHz Intel(R) Xeon machine with 128GB memory running Ubuntu 12.04.4. The maximum time and main memory usage were restricted to 2 hours and 32GB respectively, although we noticed that for most benchmarks, all three algorithms used less than 2 GB memory.

B. Results and Discussion 

We conducted our experiments with 424 benchmarks, of which 160 were TYPE-1 benchmarks and 264 were TYPE-2 benchmarks. The 424 benchmarks covered a wide spectrum in terms of number of factors, total number of variables, and number of quantified variables. For instance, in the TYPE-1 category, the number of factors varied from 44 to 7034, the total number of variables varied from 94 to 9782, and the number of variables to eliminate varied from 60 to 4751. Amongst the TYPE-2 benchmarks, the number of factors varied from 24 to 3956, the total number of variables varied from 70 to 5963, and the number of variables to eliminate varied from 21 to 2689.

1) CegarSkolem vs MonoSkolem: The performance of these two algorithms on all the benchmarks (TYPE-1 and TYPE-2) is shown in the scatter plots of Figure 1, where Figure 1a shows the average sizes of Skolem functions generated in a Skolem function vector and Figure 1b shows the total time taken in seconds. From Figure 1a it is clear that the Skolem functions generated by CegarSkolem in a Skolem function vector are on average smaller than those generated by MonoSkolem. There is no instance on which CegarSkolem generates Skolem function vectors with larger functions on average vis-a-vis MonoSkolem.

Due to repeated calls to the SAT solver, CegarSkolem takes more time than MonoSkolem on some benchmarks, but on most of them the total time taken by both algorithms is less than 100 seconds (Figure 1b). Indeed, on profiling we found that CegarSkolem spent most of its time on SAT solving. On 38 benchmarks where CegarSkolem took more than 100 but less than 300 seconds, MonoSkolem performed significantly worse, taking more than 1000 seconds. We found that the degradation of MonoSkolem was due to the large sizes of Skolem functions generated (of the order of 1 million AIG nodes) compared to those generated by CegarSkolem (< 8000 AIG nodes). Large Skolem function sizes clearly imply more time spent in function composition and reverse-substitution.

Fig. 1: CegarSkolem vs MonoSkolem on TYPE-1 & TYPE-2 benchmarks. Topmost (rightmost) points indicate benchmarks where MonoSkolem (CegarSkolem) was unsuccessful.

Fig. 2: CegarSkolem vs Bloqqer on TYPE-1 benchmarks: (a) maximum size of Skolem functions; (b) time taken (in seconds). Topmost (rightmost) points indicate benchmarks for which Bloqqer (CegarSkolem) was unsuccessful.

For benchmarks where the sizes of Skolem functions generated were even larger (of the order of $10^7$ AIG nodes), MonoSkolem could not complete generation of all Skolem functions: for 8 benchmarks, the memory consumed by MonoSkolem increased rapidly, resulting in memory outs; for 10 benchmarks, it ran out of time; for an overwhelming 83 benchmarks, it encountered integer overflows (and hence assertion failures) in the underlying ABC library. These are indicated by the topmost points (see label “FA” on the axes) in Figure 1. In contrast, CegarSkolem generated Skolem functions for almost all (412/424) benchmarks. The rightmost points indicate the 12 cases where CegarSkolem failed, of which 10 were time-outs and 2 were memory outs.

2) CegarSkolem vs Bloqqer: Of the 160 TYPE-1 benchmarks, Bloqqer successfully generated Skolem function vectors in 148 cases. It gave a NOT VERIFIED message for the remaining 12 benchmarks (in less than 30 minutes). These benchmarks are indicated by the topmost points (see label “FA” on the axes) in the scatter plots of Figure 2. Of these, 8 are large benchmarks with 1000+ factors and variables to eliminate (overall, there are 9 such large benchmarks). On the other hand, CegarSkolem was able to successfully generate Skolem functions on 154 benchmarks, including the 9 large benchmarks, on each of which it took less than 20 minutes.

For the 142 benchmarks for which both algorithms succeeded, we compared the times taken in Figure 2b. As earlier, CegarSkolem took more time on many benchmarks, but there were several benchmarks, including the large benchmarks, on which Bloqqer was out-performed. We also compared the maximum sizes of Skolem functions generated in a Skolem function vector (see Figure 2a). We used the maximum (instead of average) size, since Tseitin encoding was needed to convert the benchmarks to qdimacs format, and this introduces many variables whose Skolem function sizes are very small, skewing the average. For a majority (108/142) of the benchmarks where both algorithms succeeded, the maximum sizes of Skolem functions obtained by CegarSkolem were smaller than those generated by Bloqqer. Hence, not only does CegarSkolem run faster on the large benchmarks, it also generates smaller Skolem functions on most of them.

3) Discussion: For all benchmarks on which CegarSkolem timed out, we noticed that there were large subsets of factors that shared many variables in their supports. As a result, CegarSkolem could not exploit the factored representation effectively, requiring many refinements. We also noticed that for many benchmarks (197/424), the initial abstract Skolem functions were correct, and most of the time was spent in the SAT solver. In fact, on averaging over all benchmarks, we found that around 33% of the time spent by CegarSkolem was for SAT solving. This shows that we can leverage improvements in SAT solving technology to improve the performance of CegarSkolem.

VI. Conclusion and Future Work 

We presented a CEGAR algorithm for generating Skolem functions from factored propositional formulas. Our experiments show that for complex functions, our algorithm outperforms two state-of-the-art algorithms. As part of future work, we will explore integration with more efficient SAT solvers and refinement using multiple counter-examples.